Conversation

@joerunde
Collaborator

This PR:

  • Updates the k8s example to add the required scheduler name and switch to a startup probe with a proper progress deadline
  • Adds a page with a kserve example for RHOAI usage
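
For context, the two Deployment changes described above might look roughly like the sketch below. This is illustrative only: the scheduler name, image, and probe values are placeholders, not values taken from the PR diff.

```console
# Sketch only: schedulerName and startupProbe as they might appear in the
# example Deployment. <required-scheduler-name> and the probe numbers are
# placeholders, not values from the PR.
oc apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-spyre-example
spec:
  selector:
    matchLabels:
      app: vllm-spyre-example
  template:
    metadata:
      labels:
        app: vllm-spyre-example
    spec:
      schedulerName: <required-scheduler-name>
      containers:
      - name: vllm
        image: <vllm-spyre-image>
        startupProbe:
          httpGet:
            path: /health
            port: 8000
          # "progress deadline": up to 180 x 10s = 30 min to start up
          periodSeconds: 10
          failureThreshold: 180
EOF
```

A startup probe holds off liveness and readiness checks until it first succeeds, which suits long model-loading times better than a liveness probe with a very large `initialDelaySeconds`.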

@joerunde joerunde requested a review from rafvasq as a code owner June 10, 2025 20:55
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure that your code passes all of the linting checks, otherwise your PR won't be mergeable. To fix this, first install the linting requirements, then run `format.sh` and commit the changes. This can be done with uv directly:

```console
uv sync --frozen --group lint --active --inexact
```

Or with pip:

```console
uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh
```

Now you are good to go 🚀

@joerunde
Collaborator Author

```console
bot:test
MARKERS="embedding"
```

Collaborator

@rafvasq rafvasq left a comment


I'll read through the RHOAI doc tomorrow, quick note that it's not in .nav.yml though!

@joerunde
Collaborator Author

Thanks!
Also cc @tjohnson31415: what I don't know how to do with kserve is correctly set up the ports so that you can just curl the predictor service. On my cluster I can seemingly hit the service if I set the port manually, but I don't think I should have to do that.

This connects:

```console
curl http://granite-3-1-8b-instruct-predictor.a1-vllm-spyre:8000/v1/completions
```

But this refuses a connection:

```console
curl http://granite-3-1-8b-instruct-predictor.a1-vllm-spyre/v1/completions
```

(I'm probably just missing something simple)
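
One possible cause (an assumption on my part, not confirmed in this thread): if the runtime container doesn't declare its port, the generated predictor Service may not map a default port onto vLLM's 8000. Declaring it explicitly in the ServingRuntime might look like this sketch, where the runtime name and image are placeholders:

```console
# Sketch: declare the container port in the ServingRuntime so the generated
# predictor Service can map its default port to vLLM's 8000.
# <vllm-spyre-image> is a placeholder, not a real image reference.
oc apply -f - <<EOF
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: vllm-spyre
spec:
  supportedModelFormats:
  - name: vLLM
  containers:
  - name: kserve-container
    image: <vllm-spyre-image>
    ports:
    - containerPort: 8000
      protocol: TCP
EOF
```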

Comment on lines 98 to 105
3. Deploy and Test

Apply the manifests using `oc apply -f <filename>`:

```console
oc apply -f servingruntime.yaml
oc apply -f inferenceservice.yaml
```
Collaborator


I think you could just include these apply lines in each step above, right after defining the manifests, and then use this step for a "Perform an inference request" example.

Collaborator Author


👍
I went ahead with the heredoc pattern in each section above (`oc apply -f - <<EOF`) and just linked out to the relevant kserve docs here for how to set up inference with a vLLM deployment, so we don't have to repeat that info.
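
For readers skimming the thread, the heredoc pattern is just piping the manifest to `oc` on stdin instead of keeping a separate file. A minimal sketch, with an assumed runtime name and a placeholder storage URI:

```console
# Apply a manifest from stdin: "-f -" tells oc to read the resource
# from the heredoc rather than from a file on disk.
oc apply -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-3-1-8b-instruct
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM
      runtime: vllm-spyre                            # assumed runtime name
      storageUri: oci://<registry>/<model-artifact>  # placeholder
EOF
```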

@prashantgupta24
Collaborator

I think there was a way to fix the sign-off on commit suggestions directly from the PR page, but I can't remember 🤔

Expected "Joe Runde [email protected]", but got "Joe Runde [email protected]".

@joerunde
Collaborator Author

Yeah, that confuses me, since my regular commits (that pass DCO) are all signed off with [email protected] :(

Co-authored-by: Rafael Vasquez <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
@joerunde
Collaborator Author

@prashantgupta24 for some reason GitHub defaulted to my IBM email for the commit author, but my personal email for the signoff. 🤷

```console
git commit --amend --reset-author
git push -f
```

fixes it

Collaborator

@rafvasq rafvasq left a comment


My last nit, thanks Joe!

Co-authored-by: Rafael Vasquez <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
@rafvasq rafvasq merged commit d56423f into main Jun 17, 2025
19 checks passed
@rafvasq rafvasq deleted the rhoai-examples branch June 17, 2025 12:50